Packages to install:

library(sf)
library(sp)
library(tidyverse)
library(ggmap)
library(adehabitatHR)
library(fractaldim)

1. Read in geospatial data (GPX)

The {sf} package allows spatial objects to be stored as data frames that are easily manipulated with {dplyr}. The spatial geometry of a feature is stored in a list-column called geometry, which can be as complex as needed to store more geographic information in a single variable for each feature. The st_read( ) function can read in a number of different filetypes and automatically recognizes file suffixes. Use st_drivers( ) to print all file formats that can be read in by st_read( ).

In this example, we will be loading in GPX files containing chimpanzee group movement data collected from January to May 2019 on a handheld GPS tracker by a researcher on foot following the members of two adjacent groups of chimpanzees (Central/East and West) in Kibale National Park, Uganda.

f1 <- "PartyTracks_West.GPX"
f2 <- "PartyTracks_CentralEast.GPX"

st_layers( ) prints the layer types a file contains, along with the number of features and fields for each.

st_layers(f1)
## Driver: GPX 
## Available layers:
##     layer_name     geometry_type features fields
## 1    waypoints             Point        0     23
## 2       routes       Line String        0     12
## 3       tracks Multi Line String       81     14
## 4 route_points             Point        0     25
## 5 track_points             Point    38415     26

The layer we are interested in here is “track_points”, which we can specify with the layer argument when we read in our files with st_read( ).

west.sf <- st_read(f1, layer = "track_points")
## Reading layer `track_points' from data source `/Users/izziclark/Development/repos/ADA-GeospatialVignette-StevenRubenIzzi/PartyTracks_West.GPX' using driver `GPX'
## Simple feature collection with 38415 features and 26 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 30.38279 ymin: 0.4733126 xmax: 30.42585 ymax: 0.5199342
## CRS:            4326
eastcent.sf <- st_read(f2, layer = "track_points")
## Reading layer `track_points' from data source `/Users/izziclark/Development/repos/ADA-GeospatialVignette-StevenRubenIzzi/PartyTracks_CentralEast.GPX' using driver `GPX'
## Simple feature collection with 34994 features and 26 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 30.40663 ymin: 0.4674995 xmax: 30.44431 ymax: 0.5277813
## CRS:            4326

We can do an inital visualization of our data using the plot function. st_geometry( ) pulls only the geometry column, which can be useful for visualization, and may be required for some layer types or file formats. Our track_point data does not plot without this function.

plot(st_geometry(west.sf))

For further analysis and prettier mapping, it will be useful to extract longitude and latitude from the {sf} data frame geometry column using the st_coordinates( ) function, which stores longitude (X) and latitude (Y) in a matrix. We can stash these in a tibble containing the time stamp, longitude, and latitude of each track_point.

# For west
longlat <- st_coordinates(west.sf$geometry) # coordinates extracted from sf geometry column into matrix XY
west <- tibble(time = west.sf$time, longitude = longlat[,1], latitude = longlat[,2])
# For eastcentral
longlat <- st_coordinates(eastcent.sf$geometry)
eastcent <- tibble(time = eastcent.sf$time, longitude = longlat[,1], latitude = longlat[,2])

2. Defining home ranges using {adehabitatHR}

One thing animal movement tracking data can tell us about is home ranges. Here we explore two ways to define chimpanzee home ranges: minimum convex polygons and kernel density estimation. Both can be done with the {adehabitat} package - some helpful vignettes can be found here: (https://mran.microsoft.com/snapshot/2017-12-11/web/packages/adehabitatHR/vignettes/adehabitatHR.pdf)

2.1 Minimum convex polygons (MCP)

A minimum convex polygon is the smallest polygon around a set of points, where all interior angles are less than 180 degrees. We can create this with the mcp( ) function from the {adehabitatHR} package. First, we must convert our data to a SpatialPointsDataFrame (SPDF), which requires the {sp} package’s coordinates( ) function.

# New tibble containing id, long, lat
west.sp <- tibble(id = "west", longitude = west$longitude, latitude = west$latitude)
# Convert to SPDF by defining coordinates
coordinates(west.sp) <- ~longitude + latitude
class(west.sp) # west.sp is now an SPDF!
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

We also need to set the correct coordinate system for our data, which were recorded in the standard WGS84 datum using geographic coordinates (longitude and latitude). UTM (Universal Transverse Mercator) is another coordinate system in which geographic locations are measured by distance (in meters) from the central meridian of the UTM zone in which it lies. Kibale National Park, Uganda, lies in Zone 36N (https://spatialreference.org/ref/epsg/wgs-84-utm-zone-36n/). We we must first tell our SPDF that it’s in WGS84 using proj4string( ), then transform our longitudes and latitudes to UTM eastings and northings (which are in meters) using spTransform( ). Every coordinate system has a standardized unique EPSG code that can be used as shorthand in the CRS( ) coordinate system function. Read more about coordinate systems and reprojecting vector data in R here: https://www.earthdatascience.org/courses/earth-analytics/spatial-data-r/reproject-vector-data/

proj4string(west.sp) <- CRS('+init=epsg:4326') # WGS84
west.sp <- spTransform(west.sp, CRS('+init=epsg:32636')) # UTM zone 36N for Kibale National Park, Uganda 

Now we’re ready to call mcp( ) to define our minimum convex polygon. The function takes a percent of points to include (vs outliers to exclude), unin for units of relocation coordinates (“km” or default “m” - which is why we had to convert from geographic coordinates to UTM), and unout for units of output area (“m2” for square meters, “km2” for square kilometers, or default “ha” for hectares).

west.mcp <- mcp(west.sp, percent = 95, unout = "km2") # exclude 5% of outlier points. 
west.mcp # 11.85 square km
## Object of class "SpatialPolygonsDataFrame" (package sp):
## 
## Number of SpatialPolygons:  1
## 
## Variables measured:
##        id     area
## west west 11.84781

When we include 95% of track points, the west group’s home range is 11.85 km2. We can visualize this by plotting it against the track points.

plot(west.sp)
plot(west.mcp, col = alpha(1:5, 0.5), add = TRUE) # set transparency with alpha

We can also see how the MCP increases in area as we include a greater percentage of points (from 50 to 100 by increments of 5) using mcp.area( ) and setting the plotit argument to true. Note that home range size is in m2, not km2.

mcp.area(west.sp, percent = seq(50, 100, by = 5), plotit = TRUE)

##          west
## 50   373.9749
## 55   413.0036
## 60   473.2108
## 65   538.9726
## 70   614.4894
## 75   665.8110
## 80   725.2620
## 85   855.6331
## 90   979.3295
## 95  1184.7809
## 100 1664.6776

And now we’ll repeat some of these steps for the central/east chimpanzee group, but combine the SPDF’s of both groups by simply using the rbind( ) function. mcp( ) then automatically uses the id column to generate a polygon for each group in the data frame. We can visualize the home ranges and their overlap by using plot() and setting the colors by id.

# Create SPDF
eastcent.sp <- tibble(id = "eastcent", longitude = eastcent$longitude, latitude = eastcent$latitude)
coordinates(eastcent.sp) <- ~longitude + latitude
# Reproject coordinate system to UTM Zone 36N
proj4string(eastcent.sp) <- CRS('+init=epsg:4326')
eastcent.sp <- spTransform(eastcent.sp, CRS('+init=epsg:32636'))
# Combine west and eastcentral SPDF's
alltracks.sp <- rbind(west.sp, eastcent.sp)
alltracks.mcp <- mcp(alltracks.sp, percent = 95, unout = "km2")
alltracks.mcp
## Object of class "SpatialPolygonsDataFrame" (package sp):
## 
## Number of SpatialPolygons:  2
## 
## Variables measured:
##                id     area
## eastcent eastcent 17.65245
## west         west 11.84781
# Plot
plot(alltracks.sp, col = as.factor(alltracks.sp@data$id))
plot(alltracks.mcp, col = alpha(1:5, 0.5), add = TRUE)

The east/central group has a larger home range (17.65 km2) than the west group (11.85 km2), which makes sense since the west group consists of about 70 chimpanzees, while east/central has about 130!

2.2 Kernel density estimation

While minimum convex polygons are a quick and easy way to visualize the bounds of animal movement, they tend to overestimate home ranges by including areas that are not actually used. A more accurate method is kernel density estimation, which maps actual utilization distribution (UD) of a habitat. A kernel uses a function to predict how likely use is for each pixel within a grid. The function includes a smoothing factor or bandwidth h, which is the distance over which a data point can influence UD - a larger h means more smoothing and an increased estimate of home range size.

We can use the kernelUD( ) function, also included in {adehabitatHR}, on our combined SpatialPointsDataFrame to generate kernels for both east/central and west chimpanzee groups, then visualize these using image( ). The default h is the “reference bandwidth” based on standard deviations of x and y coordinates and total number of relocations or points in our dataset.

alltracks.kernel <- kernelUD(alltracks.sp, h = "href") # href = the reference bandwidth
image(alltracks.kernel)

The kernels alone are difficult to interpret, but we can convert them to polygons in a SpatialPolygonsDataFrame with the getverticeshr( ) function, which takes arguments similar to mcp( ). We use the default percent of 95 so that 95% contour lines are used (95% of estimated distribution). We can plot to visualize the home range polygons of each group and use print( ) to return their areas.

kernel.poly <- getverticeshr(alltracks.kernel, percent = 95, unout = "km2")
plot(kernel.poly)

print(kernel.poly)
## Object of class "SpatialPolygonsDataFrame" (package sp):
## 
## Number of SpatialPolygons:  2
## 
## Variables measured:
##                id     area
## eastcent eastcent 15.52726
## west         west 11.27180

Notice that both kernel density estimates of area are lower than those produced by generating minimum convex polygons. East/central has decreased by over 2 km2, suggesting they utilize a smaller core area more heavily.

3. Mapping tracks and home ranges using {ggmap}

We can create nicer maps with geographic referencing using {ggmap}, a package that allows us to interact with Google Maps while building maps in a way similar to how we build plots in {ggplot2}.
Code adapted from: https://jamesepaterson.github.io/jamespatersonblog/01_trackingworkshop_formatting

We can generate a basemap using the get_stamenmap( ) function, which allows us to specify a bounding box and zoom level for our map using min and max coordinates from our dataset. First, we must transform our {sp} objects back into geographic coordinates rather than UTM.

# Transform to longlat coordinates
alltracks.spgeo <- spTransform(alltracks.sp, CRS("+proj=longlat"))
alltracks.mcpgeo <- spTransform(alltracks.mcp, CRS("+proj=longlat"))
alltracks.kgeo <- spTransform(kernel.poly, CRS("+proj=longlat"))
# Generate basemap
basemap <- get_stamenmap(bbox = c(
  left = min(alltracks.spgeo@coords[,1])-0.005,
  bottom = min(alltracks.spgeo@coords[,2])-0.005,
  right = max(alltracks.spgeo@coords[,1])+0.005,
  top = max(alltracks.spgeo@coords[,2])+0.005),
  zoom = 12)

Now we’re ready to map by building up from ggmap(basemap) with minimum convex polygons and trackpoints for each group.

# Turn the spatial data frame of points into a regular dataframe for plotting
alltracks.geo <- data.frame(alltracks.spgeo@coords, id = alltracks.spgeo@data$id)
# Map of home ranges using minimum convex polygons
map_MCP <- ggmap(basemap) + 
  # "fortify" polygon layer to add geometry to the dataframe
  geom_polygon(data = fortify(alltracks.mcpgeo), aes(long, lat, colour = id, fill = id), alpha = 0.3) + 
  geom_point(data = alltracks.geo, aes(x = longitude, y = latitude, colour = id), alpha = 0.01, pch = 20)  + # set transparency using alpha and smaller point size use pch.
  labs(x = "Longitude", y = "Latitude")
map_MCP

And we’ll do the same for our kernel density estimates!

# Map using home range polygons generated from kernel density estimates
map_KDE <- ggmap(basemap) + 
  geom_polygon(data = fortify(alltracks.kgeo), aes(long, lat, colour = id, fill = id), alpha = 0.3) + 
  geom_point(data = alltracks.geo, aes(x = longitude, y = latitude, colour = id), alpha = 0.01, pch = 20)  + 
  labs(x = "Longitude", y = "Latitude")
map_KDE

4. Fractal analysis using {fractaldim}

Finally, we’ll take a look at fractal analysis, which is one method used to quantify animal movement. A fractal analysis will output a fractal dimension (D): a ratio that describes how the details of a pattern change as the scale at which they are measured decreases. As a result, the higher the D of an animal movement pattern, the more complex the pattern is.

To demonstrate how fractal D can change, we will use the {fractaldim} package on the NgogoTrees dataset, which contains waypoints of trees the chimpanzees visited to feed, along with a timestamp of when they were first observed feeding at each tree.

#Import the data 
f3 <- "NgogoTrees.GPX"
feeding <- st_read(f3, layer = "waypoints")
## Reading layer `waypoints' from data source `/Users/izziclark/Development/repos/ADA-GeospatialVignette-StevenRubenIzzi/NgogoTrees.GPX' using driver `GPX'
## Simple feature collection with 224 features and 26 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 30.38366 ymin: 0.46838 xmax: 30.44107 ymax: 0.527326
## CRS:            4326

Let’s split up the data by time, so we can compare trees chimps visited at three different time points.

#Early timepoint
early <- feeding %>%
  filter(time <= as.Date("2019-03-15"))
earlylong <- sapply(early$geometry,"[[", 2)
earlylat <- sapply(early$geometry,"[[", 1)

#Middle timepoint
mid <- feeding %>%
  filter(time >= as.Date("2019-03-15") & time <= as.Date("2019-05-01"))
midlong <- sapply(mid$geometry,"[[", 2)
midlat <- sapply(mid$geometry,"[[", 1)

#Late timepoint
late <- feeding%>%
  filter(time >= as.Date("2019-05-01"))
latelong <- sapply(late$geometry,"[[", 2)
latelat <- sapply(late$geometry,"[[", 1)

Now use ggplot to see what these three timepoints look like:

ggplot(data=early,aes(earlylat,earlylong))+ 
  geom_point()

ggplot(data=mid,aes(midlat,midlong))+ 
  geom_point()

ggplot(data=late,aes(latelat,latelong))+ 
  geom_point()

Now we can use fractaldim to calculate a D value for each of the three timepoints.

#Early
fd.estim.dctII(cbind(earlylat,earlylong),plot.loglog=TRUE, plot.allpoints=TRUE, nlags="auto")

#Middle
fd.estim.dctII(cbind(midlat,midlong),plot.loglog=TRUE, plot.allpoints=TRUE, nlags="auto")

#Late
fd.estim.dctII(cbind(latelat,latelong),plot.loglog=TRUE, plot.allpoints=TRUE, nlags="auto")

The three timepoints vary slightly in terms of Fractal D, indicating that there may be a difference in how the animals are visiting their feeding trees over time. Given the middle timepoint had the lowest fractal D, it is possible that at that time of year the animals may be moving around from tree to tree less than at the other two times.

Conclusion

We’ve provided some examples of how to read in GPX data using {sf}, estimate animal home ranges using {sp} objects and {adehabitatHR}, map geospatial data using {ggmap}, and use fractal analysis to analyze space use patterns with {fractaldim}, but we’ve just barely scratched the surface of what can be done with geospatial data in R! Here are some more resources:

https://rspatial.org/
https://geocompr.robinlovelace.net/
https://r-spatial.github.io/sf/articles/sf1.html